Discrete choice models (DCMs) require a priori knowledge of the utility function, in particular of how taste varies across individuals. Utility misspecification may lead to biased estimates, inaccurate interpretation, and limited predictability. In this paper, we utilize a neural network to learn taste representation. Our formulation consists of two modules: a neural network (TasteNet) that learns taste parameters (e.g., the time coefficient) as flexible functions of individual characteristics; and a multinomial logit (MNL) model with a utility function specified with expert knowledge. Taste parameters learned by the neural network are fed into the choice model, linking the two modules. Our approach extends the L-MNL model (Sifringer et al., 2020) by allowing the neural network to learn interactions between individual characteristics and alternative attributes. Moreover, we formalize and strengthen the interpretability condition, which requires realistic estimates of behavioral indicators (e.g., value of time, elasticity) at the disaggregate level, essential for a model to be used in scenario analysis and policy decisions. Through a unique network architecture and parameter transformation, we incorporate prior knowledge and guide the neural network to output realistic behavioral indicators at the disaggregate level. We show that TasteNet-MNL reaches the predictability of the ground-truth model and recovers the nonlinear taste functions on synthetic data. Its estimated value of time and choice elasticities at the individual level are close to the ground truth. On the publicly available Swissmetro dataset, TasteNet-MNL outperforms the benchmark MNL and mixed logit models in predictability. It learns a broad spectrum of taste variations within the population and suggests a higher average value of time.
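As a rough illustration of the two-module formulation described above, the sketch below wires a small feed-forward network that maps individual characteristics to a time coefficient into a hand-specified MNL utility over time and cost attributes. The class name, layer sizes, sign constraint, and utility specification are illustrative assumptions, not the authors' code.

```python
# Minimal sketch (not the authors' implementation) of the TasteNet-MNL idea:
# a neural network maps individual characteristics z to a taste parameter,
# which then enters a fixed, expert-specified MNL utility over alternative
# attributes. Layer sizes, names, and the utility form are assumptions.
import torch
import torch.nn as nn


class TasteNetMNL(nn.Module):
    def __init__(self, n_char, n_alt, hidden=32):
        super().__init__()
        # TasteNet module: individual characteristics -> taste parameter(s).
        self.taste_net = nn.Sequential(
            nn.Linear(n_char, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )
        self.asc = nn.Parameter(torch.zeros(n_alt))          # alternative-specific constants
        self.beta_cost = nn.Parameter(torch.tensor(-1.0))    # expert-specified cost coefficient

    def forward(self, z, time, cost):
        # Softplus keeps the time coefficient negative: a simple sign constraint
        # standing in for the paper's parameter transformation toward
        # behaviorally realistic indicators.
        beta_time = -nn.functional.softplus(self.taste_net(z))   # (batch, 1)
        v = self.asc + beta_time * time + self.beta_cost * cost  # (batch, n_alt)
        return torch.log_softmax(v, dim=-1)                      # MNL choice log-probabilities


# Training would minimize the negative log-likelihood of observed choices:
# loss = nn.NLLLoss()(model(z, time, cost), chosen_alternative_index)
```

A behavioral indicator such as the value of time would then follow from the ratio of the individual-level time coefficient to the cost coefficient.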
Predicting travel times under rare temporal conditions (e.g., public holidays, school vacation periods, etc.) constitutes a challenge due to the limitation of historical data. If at all available, historical data usually form a heterogeneous time series because of the high probability of other changes over long periods of time (e.g., road works, newly introduced traffic calming initiatives, etc.). This is especially prominent in cities and suburban areas. We propose a vector-space model for encoding rare temporal conditions that allows a coherent representation across different temporal conditions. We show improved travel time prediction performance over different baselines when the vector-space encoding is used to represent the temporal setting.
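The abstract does not spell out the encoding itself, so the sketch below simply assumes a learned embedding table over calendar conditions that is concatenated with the remaining inputs of a travel time regressor; all names and dimensions are hypothetical.

```python
# Illustrative sketch only: assumes a learned embedding of temporal conditions
# (weekday, public holiday, school holiday, ...) concatenated with other trip
# features and fed to a travel-time regressor. Rare conditions share the same
# vector space as frequent ones, which is what allows a coherent representation.
import torch
import torch.nn as nn

TEMPORAL_CONDITIONS = ["weekday", "weekend", "public_holiday", "school_holiday"]


class TravelTimeModel(nn.Module):
    def __init__(self, n_features, emb_dim=8, hidden=64):
        super().__init__()
        self.temporal_emb = nn.Embedding(len(TEMPORAL_CONDITIONS), emb_dim)
        self.regressor = nn.Sequential(
            nn.Linear(n_features + emb_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )

    def forward(self, features, condition_idx):
        enc = self.temporal_emb(condition_idx)                   # (batch, emb_dim)
        return self.regressor(torch.cat([features, enc], dim=-1)).squeeze(-1)
```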
Shared mobility services require accurate demand models for effective service planning. On the one hand, modeling the full probability distribution of demand is advantageous, because the entire uncertainty structure preserves valuable information for decision making. On the other hand, demand is often observed through the usage of the service itself, so that the observations are censored, as they are inherently limited by the available supply. Since the 1980s, various works have studied censored quantile regression models that perform well under such conditions. Moreover, in the last two decades several papers have proposed flexible implementations of these models via neural networks. However, the models in current works estimate the quantiles individually, which incurs computational overhead and ignores the valuable relationships between quantiles. We address this gap by extending current censored quantile regression models to learn multiple quantiles at once, and apply them to a synthetic baseline dataset and to datasets from two shared mobility providers in the Copenhagen metropolitan area in Denmark. The results show that our extended models produce fewer quantile crossings and less computational overhead without compromising model performance.
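A minimal sketch of the joint multi-quantile idea follows, assuming Powell-style right censoring at the observed supply cap and a shared network with one output per quantile; the quantile levels, architecture, and variable names are assumptions rather than the authors' implementation.

```python
# Hedged sketch, not the authors' code: a single network with one output per
# quantile, trained with a censored pinball (tilted absolute) loss. For
# right-censored demand (observations capped by available supply), the
# prediction is clipped at the censoring threshold before the loss is taken.
import torch
import torch.nn as nn

QUANTILES = torch.tensor([0.1, 0.25, 0.5, 0.75, 0.9])   # assumed quantile levels


def censored_pinball_loss(pred, y, cap, taus=QUANTILES):
    # pred: (batch, n_quantiles), y: (batch,), cap: (batch,) censoring threshold.
    clipped = torch.minimum(pred, cap.unsqueeze(-1))     # right censoring at the supply cap
    err = y.unsqueeze(-1) - clipped
    return torch.mean(torch.maximum(taus * err, (taus - 1.0) * err))


# One network estimating all quantiles jointly, rather than one model per tau.
model = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Linear(64, len(QUANTILES)))
```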
The regulation of advanced technologies such as Artificial Intelligence (AI) has become increasingly important, given the potential impacts such as the associated risks and ethical issues. With the great benefits promised from being the first to supply such technologies, safety precautions and societal consequences might be ignored or traded off in exchange for speeding up development, engendering a racing narrative among developers. Starting from a game-theoretical model describing an idealized technology race in a well-mixed world of players, we investigate how different interaction structures among race participants can alter collective choices and the requirements for regulatory actions. Our findings indicate that, when participants portray a strong diversity in terms of connections and peer influence (e.g., when scale-free networks shape the interactions among parties), the conflicts present in homogeneous settings are significantly reduced, thereby lessening the need for regulatory action. Moreover, our results suggest that technology governance and regulation may profit from the patent heterogeneity and inequality among firms and nations, so that regulatory interventions can be targeted at a small number of participants capable of influencing the entire population towards an ethical and sustainable use of advanced technologies.
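To make the setup concrete, here is a toy simulation, under assumed payoffs and an assumed update rule (pairwise Fermi imitation), of SAFE vs. UNSAFE development strategies spreading over a scale-free network; none of the parameter values come from the paper.

```python
# Toy sketch with placeholder values throughout, not the paper's model:
# strategies spread over a scale-free (Barabasi-Albert) network via pairwise
# imitation with the Fermi update rule.
import math
import random

import networkx as nx

PAYOFF = {("SAFE", "SAFE"): 1.0, ("SAFE", "UNSAFE"): 0.2,
          ("UNSAFE", "SAFE"): 1.5, ("UNSAFE", "UNSAFE"): 0.5}   # placeholder payoffs
BETA = 1.0  # selection intensity (assumed)


def fitness(G, strategy, node):
    # Accumulated payoff against all network neighbours.
    return sum(PAYOFF[(strategy[node], strategy[nb])] for nb in G[node])


def simulate(n=500, steps=20000, seed=0):
    rng = random.Random(seed)
    G = nx.barabasi_albert_graph(n, m=2, seed=seed)   # scale-free interaction structure
    strategy = {v: rng.choice(["SAFE", "UNSAFE"]) for v in G}
    for _ in range(steps):
        a = rng.randrange(n)
        b = rng.choice(list(G[a]))
        # Fermi rule: a imitates b with probability increasing in the payoff gap.
        p = 1.0 / (1.0 + math.exp(-BETA * (fitness(G, strategy, b) - fitness(G, strategy, a))))
        if rng.random() < p:
            strategy[a] = strategy[b]
    return sum(s == "SAFE" for s in strategy.values()) / n


print("final fraction of SAFE developers:", simulate())
```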
As public transport (PT) becomes more agile and demand-responsive, it increasingly depends on forecasts of transport demand. But how accurate must these forecasts be for effective PT operation? We address this question through an experimental case study of PT trips in Metropolitan Copenhagen, Denmark, which we conduct independently of any specific forecasting model. First, we simulate errors in demand forecasts through unbiased noise distributions that vary in shape. Using the noisy forecasts, we simulate and optimize demand-responsive PT fleets via a linear programming formulation and measure their performance. Our results suggest that the optimized performance is mainly affected by the skew of the noise distribution and by the presence of infrequent large forecast errors. In particular, the optimized performance can improve under non-Gaussian noise compared to Gaussian noise. We also find that dynamic routing reduces travel times by at least 23% compared to static routing. Based on the value of travel time in the case study, this reduction is estimated at 809,000 EUR/year.
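The first step, simulating unbiased forecast errors whose distributions differ in shape, can be illustrated as follows; the demand series, noise scales, and distributions are arbitrary stand-ins rather than the study's data.

```python
# Minimal illustration (not the study's pipeline): perturb a demand series with
# unbiased noise of different shapes, e.g. symmetric Gaussian vs. a right-skewed
# lognormal re-centred to zero mean. All parameter values are arbitrary.
import numpy as np

rng = np.random.default_rng(42)
true_demand = rng.poisson(lam=20, size=1000).astype(float)   # stand-in demand series


def gaussian_noise(size, std=5.0):
    # Symmetric, unbiased noise.
    return rng.normal(0.0, std, size)


def skewed_noise(size, sigma=0.6, scale=5.0):
    # Right-skewed noise, re-centred so it stays unbiased on average.
    raw = rng.lognormal(mean=0.0, sigma=sigma, size=size)
    return scale * (raw - raw.mean())


for name, noise in [("gaussian", gaussian_noise(true_demand.size)),
                    ("right-skewed", skewed_noise(true_demand.size))]:
    forecast = true_demand + noise
    err = forecast - true_demand
    skew = float(((err - err.mean()) ** 3).mean() / err.std() ** 3)
    print(f"{name}: bias={err.mean():.2f}, skew={skew:.2f}")
# The noisy forecasts would then feed the fleet-sizing / routing LP whose
# performance is measured downstream.
```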
Numerous works use word embedding-based metrics to quantify societal biases and stereotypes in texts. Recent studies have found that word embeddings can capture semantic similarity but may be affected by word frequency. In this work we study the effect of frequency when measuring female vs. male gender bias with word embedding-based bias quantification methods. We find that Skip-gram with negative sampling and GloVe tend to detect male bias in high frequency words, while GloVe tends to return female bias in low frequency words. We show these behaviors still exist when words are randomly shuffled. This proves that the frequency-based effect observed in unshuffled corpora stems from properties of the metric rather than from word associations. The effect is spurious and problematic since bias metrics should depend exclusively on word co-occurrences and not individual word frequencies. Finally, we compare these results with the ones obtained with an alternative metric based on Pointwise Mutual Information. We find that this metric does not show a clear dependence on frequency, even though it is slightly skewed towards male bias across all frequencies.
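For concreteness, the sketch below contrasts the two families of scores discussed: a cosine-based embedding bias against female/male anchor words and a PMI-based alternative computed from co-occurrence counts. The word lists, context definitions, and normalization are simplified assumptions.

```python
# Simplified sketch of the two bias-score families; anchor word lists and
# probability estimates are assumptions, not the paper's exact setup.
import math
from collections import Counter

import numpy as np

FEMALE, MALE = ["she", "her", "woman"], ["he", "him", "man"]


def embedding_bias(word, emb):
    # emb: dict mapping word -> vector; assumes all anchor words are in emb.
    cos = lambda a, b: float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    female = np.mean([cos(emb[word], emb[g]) for g in FEMALE])
    male = np.mean([cos(emb[word], emb[g]) for g in MALE])
    return female - male      # > 0 leans female, < 0 leans male


def pmi_bias(word, cooc: Counter, counts: Counter, n_pairs: int, n_tokens: int):
    # PMI(a, b) = log p(a, b) / (p(a) p(b)), with probabilities estimated from counts,
    # so the score depends on co-occurrences rather than raw word frequency alone.
    def pmi(a, b):
        joint = cooc[(a, b)] + cooc[(b, a)]
        if joint == 0:
            return 0.0
        return math.log((joint / n_pairs) / ((counts[a] / n_tokens) * (counts[b] / n_tokens)))
    return np.mean([pmi(word, g) for g in FEMALE]) - np.mean([pmi(word, g) for g in MALE])
```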
Candidate axiom scoring is the task of assessing the acceptability of a candidate axiom against the evidence provided by known facts or data. The ability to score candidate axioms reliably is required for automated schema or ontology induction, but it can also be valuable for ontology and/or knowledge graph validation. Accurate axiom scoring heuristics are often computationally expensive, which is an issue if you wish to use them in iterative search techniques like level-wise generate-and-test or evolutionary algorithms, which require scoring a large number of candidate axioms. We address the problem of developing a predictive model as a substitute for reasoning that predicts the possibility score of candidate class axioms and is quick enough to be employed in such situations. We use a semantic similarity measure taken from an ontology's subsumption structure for this purpose. We show that the approach provided in this work can accurately learn the possibility scores of candidate OWL class axioms and that it can do so for a variety of OWL class axioms.
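One way such a surrogate could look, purely as a sketch under assumptions, is a similarity-weighted nearest-neighbour scorer for candidate SubClassOf axioms, where similarity is a Wu-Palmer-style measure over the subsumption hierarchy; the exact model in the paper may differ.

```python
# Sketch under stated assumptions (not the paper's exact model): predict the
# possibility score of a candidate SubClassOf axiom from already-scored axioms,
# weighted by a depth-based similarity over the subsumption hierarchy.
import networkx as nx


def class_similarity(tax: nx.DiGraph, c1, c2, root="Thing"):
    # Wu-Palmer-style similarity on the subsumption DAG (edges: subclass -> superclass).
    ancestors1 = nx.descendants(tax, c1) | {c1}
    ancestors2 = nx.descendants(tax, c2) | {c2}
    depth = lambda c: nx.shortest_path_length(tax, c, root)
    lca = max(ancestors1 & ancestors2, key=depth)            # deepest common superclass
    return 2 * depth(lca) / (depth(c1) + depth(c2)) if depth(c1) + depth(c2) else 0.0


def predict_score(candidate, scored_axioms, tax, k=5):
    # candidate: (sub, sup); scored_axioms: list of ((sub, sup), score) pairs.
    sims = []
    for (sub, sup), score in scored_axioms:
        s = class_similarity(tax, candidate[0], sub) * class_similarity(tax, candidate[1], sup)
        sims.append((s, score))
    top = sorted(sims, reverse=True)[:k]
    total = sum(s for s, _ in top) or 1.0
    return sum(s * score for s, score in top) / total        # similarity-weighted average
```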
This paper proposes a question-answering system that can answer questions whose supporting evidence is spread over multiple (potentially long) documents. The system, called Visconde, uses a three-step pipeline to perform the task: decompose, retrieve, and aggregate. The first step decomposes the question into simpler questions using a few-shot large language model (LLM). Then, a state-of-the-art search engine is used to retrieve candidate passages from a large collection for each decomposed question. In the final step, we use the LLM in a few-shot setting to aggregate the contents of the passages into the final answer. The system is evaluated on three datasets: IIRC, Qasper, and StrategyQA. Results suggest that current retrievers are the main bottleneck and that readers are already performing at the human level as long as relevant passages are provided. The system is also shown to be more effective when the model is induced to give explanations before answering a question. Code is available at https://github.com/neuralmind-ai/visconde.
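A compact sketch of the decompose-retrieve-aggregate pipeline follows; the `llm` and `search_engine` callables are hypothetical placeholders, not the specific models or search engine used by Visconde.

```python
# Illustrative three-step pipeline in the spirit of Visconde (decompose,
# retrieve, aggregate). `llm` and `search_engine` are hypothetical callables
# supplied by the caller, not actual APIs from the paper.
from typing import Callable, List


def answer(question: str,
           llm: Callable[[str], str],
           search_engine: Callable[[str, int], List[str]],
           k: int = 3) -> str:
    # 1) Decompose the question into simpler sub-questions via few-shot prompting.
    decomposition = llm(
        "Decompose the question into simpler sub-questions, one per line.\n"
        f"Question: {question}\nSub-questions:"
    )
    sub_questions = [q.strip() for q in decomposition.splitlines() if q.strip()]

    # 2) Retrieve candidate passages for each sub-question.
    passages = []
    for sq in sub_questions or [question]:
        passages.extend(search_engine(sq, k))

    # 3) Aggregate: ask the LLM to reason over the passages, explanation first.
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return llm(
        f"{context}\n\nQuestion: {question}\n"
        "Explain your reasoning using the numbered passages, then give the final answer."
    )
```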
Heteroscedastic regression models a Gaussian variable's mean and variance as a function of covariates. Parametric methods that employ neural networks for these parameter maps can capture complex relationships in the data. Yet, optimizing network parameters via log likelihood gradients can yield suboptimal mean and uncalibrated variance estimates. Current solutions side-step this optimization problem with surrogate objectives or Bayesian treatments. Instead, we make two simple modifications to optimization. Notably, their combination produces a heteroscedastic model with mean estimates that are provably as accurate as those from its homoscedastic counterpart (i.e., fitting the mean under squared error loss). For a wide variety of network and task complexities, we find that mean estimates from existing heteroscedastic solutions can be significantly less accurate than those from an equivalently expressive mean-only model. Our approach provably retains the accuracy of an equally flexible mean-only model while also offering best-in-class variance calibration. Lastly, we show how to leverage our method to recover the underlying heteroscedastic noise variance.
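For context, the following sketch shows the model class under discussion: a network with mean and log-variance heads trained by Gaussian negative log-likelihood. The paper's two optimization modifications are not reproduced here, since the abstract does not spell them out; all names and sizes are assumptions.

```python
# Baseline sketch of the model class only; the abstract's two optimization
# modifications are NOT implemented here. Layer sizes and names are assumptions.
import torch
import torch.nn as nn


class HeteroscedasticNet(nn.Module):
    def __init__(self, d_in, hidden=64):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(d_in, hidden), nn.ReLU())
        self.mean_head = nn.Linear(hidden, 1)
        self.logvar_head = nn.Linear(hidden, 1)

    def forward(self, x):
        h = self.trunk(x)
        return self.mean_head(h).squeeze(-1), self.logvar_head(h).squeeze(-1)


def gaussian_nll(mean, logvar, y):
    # Per-sample NLL (up to a constant). Note the 1/variance weighting of the
    # squared error, which is what can pull the mean fit away from the
    # homoscedastic (plain squared-error) solution.
    return 0.5 * (logvar + (y - mean) ** 2 / logvar.exp()).mean()
```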
Reinforcement learning is a machine learning approach based on behavioral psychology. It focuses on learning agents that acquire knowledge and learn to carry out new tasks by interacting with the environment. However, a problem arises when reinforcement learning is used in critical contexts where the users of the system need more information and reliability regarding the actions executed by an agent. In this regard, explainable reinforcement learning seeks to provide an agent in training with methods that explain its behavior in such a way that users with no experience in machine learning can understand it. One of these is the memory-based explainable reinforcement learning method, which uses an episodic memory to compute a probability of success for each state-action pair. In this work, we propose to apply the memory-based explainable reinforcement learning method in a hierarchical environment composed of sub-tasks that must be addressed first in order to solve a more complex task. The end goal is to verify whether it is possible to give the agent the ability to explain its actions in the global task as well as in the sub-tasks. The results show that the memory-based method can be used in hierarchical environments with high-level tasks and that the computed probabilities of success serve as a basis for explaining the agent's behavior.
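A hedged sketch of the memory-based mechanism: a tabular Q-learning agent that additionally records, per (state, action) pair, how many episodes visited the pair and how many of those ended successfully, from which success probabilities are read off for explanations. In the hierarchical setting, one such memory could be kept per sub-task as well as for the global task; all of this is illustrative, not the authors' code.

```python
# Illustrative sketch (not the authors' implementation): Q-learning plus an
# episodic memory of (state, action) visits and episode outcomes, used to
# compute per-pair success probabilities for explanations.
from collections import defaultdict
import random


class MemoryQAgent:
    def __init__(self, actions, alpha=0.1, gamma=0.99, eps=0.1):
        self.actions, self.alpha, self.gamma, self.eps = actions, alpha, gamma, eps
        self.q = defaultdict(float)          # (state, action) -> value estimate
        self.visits = defaultdict(int)       # (state, action) -> episode visit count
        self.successes = defaultdict(int)    # (state, action) -> successful episodes

    def act(self, s):
        if random.random() < self.eps:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(s, a)])

    def update(self, s, a, r, s_next):
        target = r + self.gamma * max(self.q[(s_next, b)] for b in self.actions)
        self.q[(s, a)] += self.alpha * (target - self.q[(s, a)])

    def end_episode(self, trajectory, success):
        # trajectory: list of (state, action) pairs visited during the episode.
        for s, a in trajectory:
            self.visits[(s, a)] += 1
            self.successes[(s, a)] += int(success)

    def explain(self, s, a):
        n = self.visits[(s, a)]
        p = self.successes[(s, a)] / n if n else 0.0
        return f"P(success | s={s}, a={a}) = {p:.2f} over {n} episodes"
```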